SEQUENCE LISTING

4 (1) GENERAL INFORMATION: 

6 (i) APPLICANT: Suerbaum, Sebastian 

7 Labigne, Agnes 

9 (ii) TITLE OF INVENTION: Cloning and Characterization of the flbA 

10 Gene of H. Pylori, Production of Aflagellate Strains 

12 (iii) NUMBER OF SEQUENCES: 13 

14 (iv) CORRESPONDENCE ADDRESS: 

15 (A) ADDRESSEE: Finnegan, Henderson, Farabow, Garrett & 

16 Dunner 

17 (B) STREET: 1300 I Street, N.W. 

18 (C) CITY: Washington 

19 (D) STATE: D.C. 

20 (E) COUNTRY: USA 

21 (F) ZIP: 20005-3315 
23 (v) COMPUTER READABLE FORM:
24 (A) MEDIUM TYPE: Floppy disk
25 (B) COMPUTER: IBM PC compatible
26 (C) OPERATING SYSTEM: PC-DOS/MS-DOS
27 (D) SOFTWARE: PatentIn Release #1.0, Version #1.30
29 (vi) CURRENT APPLICATION DATA: 

C--> 30 (A) APPLICATION NUMBER: US/09/015,078 

C--> 31 (B) FILING DATE: 29-Jan-1998 

32 (C) CLASSIFICATION: 

34 (viii) ATTORNEY/AGENT INFORMATION: 

35 (A) NAME: Meyers, Kenneth J. 

36 (B) REGISTRATION NUMBER: 25,146 

37 (C) REFERENCE/DOCKET NUMBER: 02356.0073-01000 

39 (ix) TELECOMMUNICATION INFORMATION: 

40 (A) TELEPHONE: (202) 408-4000 

41 (B) TELEFAX: (202) 408-4400 
44 (2) INFORMATION FOR SEQ ID NO: 1:

46 (i) SEQUENCE CHARACTERISTICS:

47 (A) LENGTH: 19 base pairs

48 (B) TYPE: nucleic acid

49 (C) STRANDEDNESS: single

50 (D) TOPOLOGY: linear
52 (ii) MOLECULE TYPE: DNA (genomic)
57 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

59 ATGCCNGGNA AAGCARATG 19 

61 (2) INFORMATION FOR SEQ ID NO: 2:

63 (i) SEQUENCE CHARACTERISTICS:

64 (A) LENGTH: 18 base pairs
65 (B) TYPE: nucleic acid

66 (C) STRANDEDNESS: single

67 (D) TOPOLOGY: linear

69 (ii) MOLECULE TYPE: DNA (genomic)

74 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

76 RAAYTTCATN GCNCCRTC 18
78 (2) INFORMATION FOR SEQ ID NO: 3:

80 (i) SEQUENCE CHARACTERISTICS: 

81 (A) LENGTH: 135 base pairs 

82 (B) TYPE: nucleic acid 

83 (C) STRANDEDNESS: single 

84 (D) TOPOLOGY: linear 

86 (ii) MOLECULE TYPE: DNA (genomic) 

91 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

93 ATGCCAGGAA AGCAAATGGC GATTGATGCG GATTTAAATT CAGGGCTTAT TGATGATAAG 60 
95 GAAGCTAAAA AACGGCGCGC CGCTCTAAGC CAAGAAGCGG ATTTTTATGG TGCGATGGAT 120 
97 GGCGCGTCTA AATTT 135 
99 (2) INFORMATION FOR SEQ ID NO: 4:

101 (i) SEQUENCE CHARACTERISTICS: 

102 (A) LENGTH: 28 base pairs 

103 (B) TYPE: nucleic acid 

104 (C) STRANDEDNESS: single 

105 (D) TOPOLOGY: linear 

107 (ii) MOLECULE TYPE: DNA (genomic) 

112 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

114 CGGGATCCGT GGTTACTAAT GGTTCTAC 28

116 (2) INFORMATION FOR SEQ ID NO: 5:

118 (i) SEQUENCE CHARACTERISTICS: 

119 (A) LENGTH: 28 base pairs 

120 (B) TYPE: nucleic acid 

121 (C) STRANDEDNESS: single 

122 (D) TOPOLOGY: linear 

124 (ii) MOLECULE TYPE: DNA (genomic) 

129 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

131 CGGGATCCTC ATGGCCTCTT CAGAGACC 28 
133 (2) INFORMATION FOR SEQ ID NO: 6:

135 (i) SEQUENCE CHARACTERISTICS: 

136 (A) LENGTH: 2501 base pairs 

137 (B) TYPE: nucleic acid 

138 (C) STRANDEDNESS: single 

139 (D) TOPOLOGY: linear 

141 (ii) MOLECULE TYPE: DNA (genomic) 

146 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

148 AGCTTTTTTG TGCCATACTT TTAAACTTTA TATTATAATA AGAGACAAAC ACACCTACCA 60

150 AAATTAAGGC ATTGATTTTA GATTATGGCA AACGAACGCT CCAAATTAGC TTTTAAAAAG 120
152 ACTTTCCCTG TCTTTAAACG CTTCTTGCAA TCCAAAGACT TAGCCCTTGT GGTCTTTGTG 180
154 ATAGCGATTT TAGCGATCAT TATCGTGCCG TTACCGCCTT TTGTGTTGGA TTTTTTACTC 240

156 ACGATTTCTA TCGCGCTATC GGTGTTGATT ATTTTAATCG GGCTTTATAT TGACAAACCG 300
158 ACTGATTTTA GCGCTTTCCC CACTTTATTA CTCATTGTAA CCTTATACCG CTTGGCTTTA 360
160 AATGTCGCCA CCACTAGAAT GATTTTAACC CAAGGCTATA AAGGGCCTAG CGCGGTGAGC 420

162 ATTATTATCA CGGCGTTTGG GGAATTTAGC GTGAGCGGGA ATTATGTGAT TGGGGCTATT 480

164 ATCTTTAGTA TTTTAGTGCT GGTGAATTTA TTAGTGGTTA CTAATGGTTC TACTAGGGTT 540

166 ACTGAAGTTA GGGCGCGATT TGCCCTAGAC GCTATGCCAG GAAAGCAAAT GGCGATTGAT 600

168 GCGGATTTAA ATTCAGGGCT TATTGATGAT AAGGAAGCTA AAAAACGGCG CGCCGCTCTA 660

170 AGCCAAGAAG CGGATTTTTA TGGTGCGATG GATGGCGCGT CTAAATTTGT CAAAGGCGAT 720

172 GCGATCGCTT CTATCATTAT CACGCTTATC AATATCATTG GGGGTTTTTT AGTGGGCGTG 780

174 TTCCAAAGGG ATATGAGCTT GAGCTTTAGT GCTAGCACTT TCACTATCTT AACCATTGGC 840

176 GATGGGCTTG TAGGGCAAAT CCCTGCCTTA ATCATTGCGA CACGGACCGG TATTGTCGCC 900

178 ACTCGCACCA CGCAAAACGA AGAAGAGGAC TTTGCTTCTA AGCTCATCAC ACAGCTCACC 960

180 AATAAAAGCA AAACTTTAGT GATTGTGGGG GCGATTTATT GCTTTTGCAC CATTCCTGGA 1020

182 CTCCCTACCT TTTCTTTAGC GTTTGTAGGG GCTCTCTTTT TATTCATCGC ATGGCTGATT 1080

184 AGCAGGGAGG GAAAGGACGG GTTGCTCACT AAATTAGAAA ATTATTTGAG TCAAAAATTC 1140

186 GGCTTGGATT TGAGCGAAAA ACCCCACAGC TCCAAAATCA AACCCCACGC CCCCACCACA 1200

188 AGGGCTAAAA CCCAAGAAGA GATTAAAAGA GAAGAAGAGC AAGCCATTGA TGAAGTGTTA 1260

190 AAAATTGAAT TTTTAGAATT GGCTTTAGGC TATCAGCTCT ACAGCTTAGC GGACATGAAA 1320

192 CAAGGGGGCG ATTTGTTAGA AAGGATTAGG GGTATTAGAA AAAAGATAGC GAGCGATTAT 1380

194 GGTTTTTTGA TGCCTCAAAT TAGGATTAGG GATAATTTAC AACTCCCCCC AACGCATTAT 1440

196 GAAATCAAGC TTAAGGGCAT TGTGATTGGT GAAGGCATGG TGATGCCGGA TAAGTTTTTA 1500

198 GCCATGAATA CCGGTTTTGT GAATAAAGAA ATTGAAGGCA TTCCTACTAA AGAGCCGGCT 1560

200 TTTGGAATGG ACGCTTTATG GATTGAAACT AAAAATAAAG AAGAAGCCAT CATTCAAGGC 1620

202 TATACCATTA TTGATCCAAG CACCGTTATT GCGACGCACA CCAGCGAATT AGTGAAAAAA 1680

204 TACGCTGAAG ATTTTATCAC TAAAGATGAA GTGAAATCCC TTTTAGAGCG CTTGGCCAAA 1740

206 GACTATCCTA CGATTGTAGA AGAGAGTAAA AAAATCCCCA CCGGTGCGAT CCGATCAGTC 1800

208 TTGCAAGCCT TGTTGCATGA AAAAATCCCC ATTAAAGACA TGCTCACTAT TTTAGAAACG 1860

210 ATTACCGATA TTGCGCCATT AGTTCAAAAC GATGTGAATA TCTTAACCGA ACAAGTGAGG 1920

212 GCGAGGCTTT CTAGGGTGAT CACTAACGCT TTTAAATCTG AAGACGGGCG TTTGAAATTT 1980

214 TTAACCTTTT CTACCGATAG CGAACAATTT TTGCTTAATA AATTGCGAGA AAATGGCACT 2040

216 TCTAAGAGCC TACTACTCAA TGTGGGCGAA TTGCAAAAAC TCATTGAAGC GGTCTCTGAA 2100

218 GAGGCCATGA AAGTCTTGCA AAAAGGGATC GCTCCGGTGA TTTTGATCGT AGAGCCTAAT 2160

220 TTAAGAAAAG CCCTTTCTAA TCAAATGGAG CAGGCTAGGA TTGATGTAAT CGTGCTAAGC 2220

222 CATGCTGAAT TAGATCCTAA CTCTAATTTT GAAGCCTTAG GCACGATCCA TATTAACTTT 2280

224 TAAGGGATAA ATAATTGATA AAAAAGGAGA ATGATGCAAG TTTATCACCT TTCACACATT 2340

226 GATTTAGACG GCTATGCATG CCAGCTTGTT TCAAAACAAT TTTTTAAAAA TATCCAATGC 2400

228 TATAACGCTA ATTACGGGCG TGAAGTCTCA GCGAGAATTT ATGAGATTTT AAACGCGATC 2460

230 GCTCAATCTA AAGAGAGTGA ATTCCTTATT TTGATTAGCG A 2501
232 (2) INFORMATION FOR SEQ ID NO: 7:
234 (i) SEQUENCE CHARACTERISTICS:

235 (A) LENGTH: 732 amino acids

236 (B) TYPE: amino acid

237 (C) STRANDEDNESS: single
238 (D) TOPOLOGY: linear
240 (ii) MOLECULE TYPE: peptide

245 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

247 Met Ala Asn Glu Arg Ser Lys Leu Ala Phe Lys Lys Thr Phe Pro Val

248 1 5 10 15

250 Phe Lys Arg Phe Leu Gln Ser Lys Asp Leu Ala Leu Val Val Phe Val

251 20 25 30

253 Ile Ala Ile Leu Ala Ile Ile Ile Val Pro Leu Pro Pro Phe Val Leu

254 35 40 45

256 Asp Phe Leu Leu Thr Ile Ser Ile Ala Leu Ser Val Leu Ile Ile Leu

257 50 55 60

259 Ile Gly Leu Tyr Ile Asp Lys Pro Thr Asp Phe Ser Ala Phe Pro Thr

260 65 70 75 80

262 Leu Leu Leu Ile Val Thr Leu Tyr Arg Leu Ala Leu Asn Val Ala Thr

263 85 90 95

265 Thr Arg Met Ile Leu Thr Gln Gly Tyr Lys Gly Pro Ser Ala Val Ser

266 100 105 110

268 Ile Ile Ile Thr Ala Phe Gly Glu Phe Ser Val Ser Gly Asn Tyr Val

269 115 120 125

271 Ile Gly Ala Ile Ile Phe Ser Ile Leu Val Leu Val Asn Leu Leu Val

272 130 135 140

274 Val Thr Asn Gly Ser Thr Arg Val Thr Glu Val Arg Ala Arg Phe Ala

275 145 150 155 160

277 Leu Asp Ala Met Pro Gly Lys Gln Met Ala Ile Asp Ala Asp Leu Asn

278 165 170 175

280 Ser Gly Leu Ile Asp Asp Lys Glu Ala Lys Lys Arg Arg Ala Ala Leu

281 180 185 190

283 Ser Gln Glu Ala Asp Phe Tyr Gly Ala Met Asp Gly Ala Ser Lys Phe

284 195 200 205

286 Val Lys Gly Asp Ala Ile Ala Ser Ile Ile Ile Thr Leu Ile Asn Ile

287 210 215 220

289 Ile Gly Gly Phe Leu Val Gly Val Phe Gln Arg Asp Met Ser Leu Ser

290 225 230 235 240

292 Phe Ser Ala Ser Thr Phe Thr Ile Leu Thr Ile Gly Ala Gly Leu Val

293 245 250 255 

295 Gly Gln Ile Pro Ala Leu Ile Ile Ala Thr Arg Thr Gly Ile Val Ala

296 260 265 270

298 Thr Arg Thr Thr Gln Asn Glu Glu Glu Asp Phe Ala Ser Lys Leu Ile

299 275 280 285

301 Thr Gln Leu Thr Asn Lys Ser Lys Thr Leu Val Ile Val Gly Ala Ile

302 290 295 300

304 Tyr Cys Phe Cys Thr Ile Pro Gly Leu Pro Thr Phe Ser Leu Ala Phe

305 305 310 315 320

307 Val Gly Ala Leu Phe Leu Phe Ile Ala Trp Leu Ile Ser Arg Glu Gly

308 325 330 335

310 Lys Asp Gly Leu Leu Thr Lys Leu Glu Asn Tyr Leu Ser Gln Lys Phe

311 340 345 350

313 Gly Leu Asp Leu Ser Glu Lys Pro His Ser Ser Lys Ile Lys Pro His

314 355 360 365

316 Ala Pro Thr Thr Arg Ala Lys Thr Gln Glu Glu Ile Lys Arg Glu Glu

317 370 375 380

319 Glu Gln Ala Ile Asp Glu Val Leu Lys Ile Glu Phe Leu Glu Leu Ala

320 385 390 395 400

322 Leu Gly Thr Gln Leu Tyr Ser Leu Ala Asp Met Lys Gln Gly Gly Asp

323 405 410 415

325 Leu Leu Glu Arg Ile Arg Gly Ile Arg Lys Lys Ile Ala Ser Asp Tyr

326 420 425 430

328 Gly Phe Leu Met Pro Gln Ile Arg Ile Arg Asp Asn Leu Gln Leu Pro

329 435 440 445

331 Pro Thr His Tyr Glu Ile Lys Leu Lys Gly Ile Val Ile Gly Glu Gly

332 450 455 460

334 Met Val Met Pro Asp Lys Phe Leu Ala Met Asn Thr Gly Phe Val Asn

335 465 470 475 480

337 Lys Glu Ile Glu Gly Ile Pro Thr Lys Glu Pro Ala Phe Gly Met Asp

338 485 490 495

340 Ala Leu Trp Ile Glu Thr Lys Asn Lys Glu Glu Ala Ile Ile Gln Gly

341 500 505 510

343 Tyr Thr Ile Ile Asp Pro Ser Thr Val Ile Ala Thr His Thr Ser Glu

344 515 520 525

346 Leu Val Lys Lys Tyr Ala Glu Asp Phe Ile Thr Lys Asp Glu Val Lys

347 530 535 540

349 Ser Leu Leu Glu Arg Leu Ala Lys Asp Tyr Pro Thr Ile Val Glu Glu

350 545 550 555 560

352 Ser Lys Lys Ile Pro Thr Gly Ala Ile Arg Ser Val Leu Gln Ala Leu

353 565 570 575

355 Leu His Glu Lys Ile Pro Ile Lys Asp Met Leu Thr Ile Leu Glu Thr

356 580 585 590

358 Ile Thr Asp Ile Ala Pro Leu Val Gln Asn Asp Val Asn Ile Leu Thr

359 595 600 605

361 Glu Gln Val Arg Ala Arg Leu Ser Arg Val Ile Thr Asn Ala Phe Lys

362 610 615 620

364 Ser Glu Asp Gly Arg Leu Lys Phe Leu Thr Phe Ser Thr Asp Ser Glu

365 625 630 635 640

367 Gln Phe Leu Leu Asn Lys Leu Arg Glu Asn Gly Thr Ser Lys Ser Leu

368 645 650 655

370 Leu Leu Asn Val Gly Glu Leu Gln Lys Leu Ile Glu Ala Val Ser Glu

371 660 665 670

373 Glu Ala Met Lys Val Leu Gln Lys Gly Ile Ala Pro Val Ile Leu Ile

374 675 680 685

376 Val Glu Pro Asn Leu Arg Lys Ala Leu Ser Asn Gln Met Glu Gln Ala

377 690 695 700

379 Arg Ile Asp Val Ile Val Leu Ser His Ala Glu Leu Asp Pro Asn Ser

380 705 710 715 720

382 Asn Phe Glu Ala Leu Gly Thr Ile His Ile Asn Phe

383 725 730
385 (2) INFORMATION FOR SEQ ID NO: 8:

387 (i) SEQUENCE CHARACTERISTICS:

388 (A) LENGTH: 732 amino acids

389 (B) TYPE: amino acid

390 (C) STRANDEDNESS: single

391 (D) TOPOLOGY: linear
393 (ii) MOLECULE TYPE: peptide

398 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

400 Met Ala Asn Glu Arg Ser Lys Leu Ala Phe Lys Lys Thr Phe Pro Val

401 1 5 10 15

403 Phe Lys Arg Phe Leu Gln Ser Lys Asp Leu Ala Leu Val Val Phe Val